Overview

Dataset statistics

Number of variables: 22
Number of observations: 1108
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 190.6 KiB
Average record size in memory: 176.1 B

Variable types

Numeric: 10
Categorical: 12

Alerts

Dt_Customer has a high cardinality: 536 distinct values [High cardinality]
Income is highly correlated with Kidhome and 5 other fields [High correlation]
Kidhome is highly correlated with Income and 3 other fields [High correlation]
NumWebPurchases is highly correlated with Income and 3 other fields [High correlation]
NumCatalogPurchases is highly correlated with Income and 5 other fields [High correlation]
NumStorePurchases is highly correlated with Income and 4 other fields [High correlation]
NumWebVisitsMonth is highly correlated with Income and 1 other field [High correlation]
target is highly correlated with Income and 4 other fields [High correlation]
Income is highly correlated with NumCatalogPurchases and 3 other fields [High correlation]
Kidhome is highly correlated with NumCatalogPurchases and 1 other field [High correlation]
NumWebPurchases is highly correlated with NumStorePurchases and 1 other field [High correlation]
NumCatalogPurchases is highly correlated with Income and 4 other fields [High correlation]
NumStorePurchases is highly correlated with Income and 3 other fields [High correlation]
NumWebVisitsMonth is highly correlated with Income and 1 other field [High correlation]
target is highly correlated with Income and 4 other fields [High correlation]
Income is highly correlated with NumCatalogPurchases and 2 other fields [High correlation]
Kidhome is highly correlated with NumCatalogPurchases [High correlation]
NumWebPurchases is highly correlated with NumStorePurchases and 1 other field [High correlation]
NumCatalogPurchases is highly correlated with Income and 3 other fields [High correlation]
NumStorePurchases is highly correlated with Income and 3 other fields [High correlation]
target is highly correlated with Income and 3 other fields [High correlation]
Income is highly correlated with Kidhome and 7 other fields [High correlation]
Kidhome is highly correlated with Income and 4 other fields [High correlation]
Teenhome is highly correlated with NumDealsPurchases [High correlation]
NumDealsPurchases is highly correlated with Teenhome and 1 other field [High correlation]
NumWebPurchases is highly correlated with Income and 2 other fields [High correlation]
NumCatalogPurchases is highly correlated with Income and 3 other fields [High correlation]
NumStorePurchases is highly correlated with Income and 5 other fields [High correlation]
NumWebVisitsMonth is highly correlated with Income and 3 other fields [High correlation]
AcceptedCmp5 is highly correlated with Income and 2 other fields [High correlation]
AcceptedCmp1 is highly correlated with Income and 1 other field [High correlation]
target is highly correlated with Income and 5 other fields [High correlation]
id is uniformly distributed [Uniform]
Dt_Customer is uniformly distributed [Uniform]
id has unique values [Unique]
Recency has 15 (1.4%) zeros [Zeros]
NumDealsPurchases has 27 (2.4%) zeros [Zeros]
NumWebPurchases has 26 (2.3%) zeros [Zeros]
NumCatalogPurchases has 279 (25.2%) zeros [Zeros]
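
The zero percentages in the alerts can be checked against the reported number of observations (1108). The sketch below is only an illustration: the names `zero_counts` and `zero_share` are hypothetical, and it assumes the report rounds each share to one decimal place.

```python
# Zero counts copied from the alerts above; 1108 is the reported number of observations.
N_OBSERVATIONS = 1108
zero_counts = {
    "Recency": 15,
    "NumDealsPurchases": 27,
    "NumWebPurchases": 26,
    "NumCatalogPurchases": 279,
}


def zero_share(count, total=N_OBSERVATIONS):
    """Return the share of zeros as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


zero_percentages = {name: zero_share(count) for name, count in zero_counts.items()}
for name, pct in zero_percentages.items():
    print(f"{name}: {pct}% zeros")
```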

Reproduction

Analysis started: 2022-05-02 13:53:35.724126
Analysis finished: 2022-05-02 13:53:42.321675
Duration: 6.6 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
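
The reported duration follows from the two timestamps. A minimal sketch using only the Python standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2022-05-02 13:53:35.724126")
finished = datetime.fromisoformat("2022-05-02 13:53:42.321675")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```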

Variables

id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 1108
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 553.5
Minimum: 0
Maximum: 1107
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 55.35
Q1: 276.75
median: 553.5
Q3: 830.25
95-th percentile: 1051.65
Maximum: 1107
Range: 1107
Interquartile range (IQR): 553.5
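
These quantiles are consistent with linear interpolation between neighbouring ranks. The sketch below assumes that `id` takes every integer from 0 to 1107 exactly once, which matches the reported minimum, maximum, and distinct count but is not stated outright. The helper `percentile` is hypothetical and is not the profiler's own code.

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation between closest ranks (q in [0, 100])."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


ids = list(range(1108))  # assumption: id is 0, 1, ..., 1107 with no gaps
q05, q1, median, q3, q95 = (percentile(ids, q) for q in (5, 25, 50, 75, 95))
iqr = q3 - q1
print(q05, q1, median, q3, q95, iqr)
```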

Descriptive statistics

Standard deviation: 319.9963541
Coefficient of variation (CV): 0.5781325278
Kurtosis: -1.2
Mean: 553.5
Median Absolute Deviation (MAD): 277
Skewness: 0
Sum: 613278
Variance: 102397.6667
Monotonicity: Strictly increasing
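
Several of these figures can be reproduced with the standard library, again under the assumption that `id` is the integer sequence 0 to 1107. The variance and standard deviation below use the sample (n - 1) denominator, which is the convention that matches the reported values.

```python
import statistics

ids = list(range(1108))  # assumption: id is 0, 1, ..., 1107 with no gaps

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(ids)      # sample standard deviation
cv = std_dev / mean                  # coefficient of variation
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)  # median absolute deviation

print(total, mean, round(variance, 4), round(std_dev, 7), round(cv, 7), mad)
```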
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values
Value | Count | Frequency (%)
0 | 1 | 0.1%
737 | 1 | 0.1%
743 | 1 | 0.1%
742 | 1 | 0.1%
741 | 1 | 0.1%
740 | 1 | 0.1%
739 | 1 | 0.1%
738 | 1 | 0.1%
736 | 1 | 0.1%
728 | 1 | 0.1%
Other values (1098) | 1098 | 99.1%

Smallest values
Value | Count | Frequency (%)
0 | 1 | 0.1%
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%

Largest values
Value | Count | Frequency (%)
1107 | 1 | 0.1%
1106 | 1 | 0.1%
1105 | 1 | 0.1%
1104 | 1 | 0.1%
1103 | 1 | 0.1%
1102 | 1 | 0.1%
1101 | 1 | 0.1%
1100 | 1 | 0.1%
1099 | 1 | 0.1%
1098 | 1 | 0.1%

Year_Birth
Real number (ℝ≥0)

Distinct: 57
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1968.701264
Minimum: 1893
Maximum: 1996
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 1893
5-th percentile: 1949
Q1: 1959
median: 1970
Q3: 1977
95-th percentile: 1988
Maximum: 1996
Range: 103
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 12.22537967
Coefficient of variation (CV): 0.006209870383
Kurtosis: 1.18745119
Mean: 1968.701264
Median Absolute Deviation (MAD): 9
Skewness: -0.439100387
Sum: 2181321
Variance: 149.4599081
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values
Value | Count | Frequency (%)
1971 | 46 | 4.2%
1976 | 42 | 3.8%
1970 | 42 | 3.8%
1973 | 41 | 3.7%
1975 | 39 | 3.5%
1978 | 38 | 3.4%
1969 | 36 | 3.2%
1965 | 35 | 3.2%
1972 | 34 | 3.1%
1958 | 31 | 2.8%
Other values (47) | 724 | 65.3%

Smallest values
Value | Count | Frequency (%)
1893 | 1 | 0.1%
1900 | 1 | 0.1%
1940 | 1 | 0.1%
1941 | 1 | 0.1%
1943 | 5 | 0.5%
1944 | 4 | 0.4%
1945 | 5 | 0.5%
1946 | 10 | 0.9%
1947 | 8 | 0.7%
1948 | 12 | 1.1%

Largest values
Value | Count | Frequency (%)
1996 | 1 | 0.1%
1995 | 3 | 0.3%
1993 | 3 | 0.3%
1992 | 6 | 0.5%
1991 | 4 | 0.4%
1990 | 9 | 0.8%
1989 | 21 | 1.9%
1988 | 15 | 1.4%
1987 | 18 | 1.6%
1986 | 26 | 2.3%

Education
Categorical

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
Graduation | 570
PhD | 254
Master | 173
2n Cycle | 89
Basic | 22

Length

Max length: 10
Median length: 10
Mean length: 7.510830325
Min length: 3

Characters and Unicode

Total characters: 8322
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Master
2nd row: Graduation
3rd row: Graduation
4th row: Basic
5th row: PhD

Common Values

Value | Count | Frequency (%)
Graduation | 570 | 51.4%
PhD | 254 | 22.9%
Master | 173 | 15.6%
2n Cycle | 89 | 8.0%
Basic | 22 | 2.0%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
graduation | 570 | 47.6%
phd | 254 | 21.2%
master | 173 | 14.5%
2n | 89 | 7.4%
cycle | 89 | 7.4%
basic | 22 | 1.8%

Most occurring characters

Value | Count | Frequency (%)
a | 1335 | 16.0%
r | 743 | 8.9%
t | 743 | 8.9%
n | 659 | 7.9%
i | 592 | 7.1%
G | 570 | 6.8%
d | 570 | 6.8%
u | 570 | 6.8%
o | 570 | 6.8%
e | 262 | 3.1%
Other values (12) | 1708 | 20.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6782 | 81.5%
Uppercase Letter | 1362 | 16.4%
Decimal Number | 89 | 1.1%
Space Separator | 89 | 1.1%
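
The character and category totals for Education can be recomputed from the five category counts alone. The sketch below is an illustration with hypothetical variable names. It relies on `unicodedata.category`, which returns two-letter general-category codes (`Ll` lowercase letter, `Lu` uppercase letter, `Nd` decimal number, `Zs` space separator).

```python
import unicodedata
from collections import Counter

# Category counts copied from the Common Values table for Education.
education_counts = {"Graduation": 570, "PhD": 254, "Master": 173, "2n Cycle": 89, "Basic": 22}

char_counts = Counter()      # occurrences of each character across all rows
category_counts = Counter()  # occurrences of each Unicode general category
for label, rows in education_counts.items():
    for ch in label:
        char_counts[ch] += rows
        category_counts[unicodedata.category(ch)] += rows

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(education_counts.values())

print(total_chars, len(char_counts), round(mean_length, 9))
print(char_counts.most_common(3))
print(dict(category_counts))
```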

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1335 | 19.7%
r | 743 | 11.0%
t | 743 | 11.0%
n | 659 | 9.7%
i | 592 | 8.7%
d | 570 | 8.4%
u | 570 | 8.4%
o | 570 | 8.4%
e | 262 | 3.9%
h | 254 | 3.7%
Other values (4) | 484 | 7.1%

Uppercase Letter
Value | Count | Frequency (%)
G | 570 | 41.9%
D | 254 | 18.6%
P | 254 | 18.6%
M | 173 | 12.7%
C | 89 | 6.5%
B | 22 | 1.6%

Decimal Number
Value | Count | Frequency (%)
2 | 89 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 89 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 8144 | 97.9%
Common | 178 | 2.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1335 | 16.4%
r | 743 | 9.1%
t | 743 | 9.1%
n | 659 | 8.1%
i | 592 | 7.3%
G | 570 | 7.0%
d | 570 | 7.0%
u | 570 | 7.0%
o | 570 | 7.0%
e | 262 | 3.2%
Other values (10) | 1530 | 18.8%

Common
Value | Count | Frequency (%)
2 | 89 | 50.0%
(space) | 89 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8322 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1335 | 16.0%
r | 743 | 8.9%
t | 743 | 8.9%
n | 659 | 7.9%
i | 592 | 7.1%
G | 570 | 6.8%
d | 570 | 6.8%
u | 570 | 6.8%
o | 570 | 6.8%
e | 262 | 3.1%
Other values (12) | 1708 | 20.5%

Marital_Status
Categorical

Distinct: 8
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
Married | 415
Together | 296
Single | 234
Divorced | 120
Widow | 39
Other values (3) | 4

Length

Max length: 8
Median length: 7
Mean length: 7.086642599
Min length: 4

Characters and Unicode

Total characters: 7852
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: Together
2nd row: Single
3rd row: Married
4th row: Married
5th row: Together

Common Values

Value | Count | Frequency (%)
Married | 415 | 37.5%
Together | 296 | 26.7%
Single | 234 | 21.1%
Divorced | 120 | 10.8%
Widow | 39 | 3.5%
Alone | 2 | 0.2%
YOLO | 1 | 0.1%
Absurd | 1 | 0.1%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
married | 415 | 37.5%
together | 296 | 26.7%
single | 234 | 21.1%
divorced | 120 | 10.8%
widow | 39 | 3.5%
alone | 2 | 0.2%
yolo | 1 | 0.1%
absurd | 1 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 1363 | 17.4%
r | 1247 | 15.9%
i | 808 | 10.3%
d | 575 | 7.3%
g | 530 | 6.7%
o | 457 | 5.8%
M | 415 | 5.3%
a | 415 | 5.3%
T | 296 | 3.8%
t | 296 | 3.8%
Other values (16) | 1450 | 18.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6741 | 85.9%
Uppercase Letter | 1111 | 14.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 1363 | 20.2%
r | 1247 | 18.5%
i | 808 | 12.0%
d | 575 | 8.5%
g | 530 | 7.9%
o | 457 | 6.8%
a | 415 | 6.2%
t | 296 | 4.4%
h | 296 | 4.4%
n | 236 | 3.5%
Other values (7) | 518 | 7.7%

Uppercase Letter
Value | Count | Frequency (%)
M | 415 | 37.4%
T | 296 | 26.6%
S | 234 | 21.1%
D | 120 | 10.8%
W | 39 | 3.5%
A | 3 | 0.3%
O | 2 | 0.2%
Y | 1 | 0.1%
L | 1 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7852 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 1363 | 17.4%
r | 1247 | 15.9%
i | 808 | 10.3%
d | 575 | 7.3%
g | 530 | 6.7%
o | 457 | 5.8%
M | 415 | 5.3%
a | 415 | 5.3%
T | 296 | 3.8%
t | 296 | 3.8%
Other values (16) | 1450 | 18.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7852 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 1363 | 17.4%
r | 1247 | 15.9%
i | 808 | 10.3%
d | 575 | 7.3%
g | 530 | 6.7%
o | 457 | 5.8%
M | 415 | 5.3%
a | 415 | 5.3%
T | 296 | 3.8%
t | 296 | 3.8%
Other values (16) | 1450 | 18.5%

Income
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1031
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52075.80957
Minimum: 1730
Maximum: 162397
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 1730
5-th percentile: 19544.85
Q1: 35768.5
median: 51609.5
Q3: 68325
95-th percentile: 83834.2
Maximum: 162397
Range: 160667
Interquartile range (IQR): 32556.5

Descriptive statistics

Standard deviation: 21310.0934
Coefficient of variation (CV): 0.4092129066
Kurtosis: 0.602284487
Mean: 52075.80957
Median Absolute Deviation (MAD): 16278.5
Skewness: 0.2916339678
Sum: 57699997
Variance: 454120080.5
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values
Value | Count | Frequency (%)
7500 | 9 | 0.8%
53977 | 2 | 0.2%
70596 | 2 | 0.2%
32892 | 2 | 0.2%
33039 | 2 | 0.2%
60474 | 2 | 0.2%
80124 | 2 | 0.2%
34176 | 2 | 0.2%
38946 | 2 | 0.2%
82800 | 2 | 0.2%
Other values (1021) | 1081 | 97.6%

Smallest values
Value | Count | Frequency (%)
1730 | 1 | 0.1%
4023 | 1 | 0.1%
4861 | 1 | 0.1%
5305 | 1 | 0.1%
6835 | 1 | 0.1%
7144 | 1 | 0.1%
7500 | 9 | 0.8%
8940 | 1 | 0.1%
9722 | 1 | 0.1%
10245 | 1 | 0.1%

Largest values
Value | Count | Frequency (%)
162397 | 1 | 0.1%
157733 | 1 | 0.1%
153924 | 1 | 0.1%
113734 | 1 | 0.1%
101970 | 1 | 0.1%
98777 | 2 | 0.2%
96876 | 1 | 0.1%
96843 | 1 | 0.1%
94871 | 1 | 0.1%
94642 | 1 | 0.1%

Kidhome
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 661
1 | 418
2 | 29

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
0 | 661 | 59.7%
1 | 418 | 37.7%
2 | 29 | 2.6%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 661 | 59.7%
1 | 418 | 37.7%
2 | 29 | 2.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 661 | 59.7%
1 | 418 | 37.7%
2 | 29 | 2.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 661 | 59.7%
1 | 418 | 37.7%
2 | 29 | 2.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 661 | 59.7%
1 | 418 | 37.7%
2 | 29 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 661 | 59.7%
1 | 418 | 37.7%
2 | 29 | 2.6%

Teenhome
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 571
1 | 507
2 | 30

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 571 | 51.5%
1 | 507 | 45.8%
2 | 30 | 2.7%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 571 | 51.5%
1 | 507 | 45.8%
2 | 30 | 2.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 571 | 51.5%
1 | 507 | 45.8%
2 | 30 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 571 | 51.5%
1 | 507 | 45.8%
2 | 30 | 2.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 571 | 51.5%
1 | 507 | 45.8%
2 | 30 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 571 | 51.5%
1 | 507 | 45.8%
2 | 30 | 2.7%

Dt_Customer
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 536
Distinct (%): 48.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
28-10-2013 | 7
03-06-2013 | 7
10-01-2013 | 6
20-08-2013 | 6
31-08-2012 | 6
Other values (531) | 1076

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 11080
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 201
Unique (%): 18.1%

Sample

1st row: 21-01-2013
2nd row: 24-05-2014
3rd row: 08-04-2013
4th row: 29-03-2014
5th row: 10-06-2014
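
Dt_Customer is profiled as a categorical string, which is why it is flagged for high cardinality. The sample values look like dates in DD-MM-YYYY order (the first field exceeds 12 in several rows), so one possible follow-up is to parse them as dates. The sketch below makes that format assumption explicit and uses only the five sample rows shown above.

```python
from datetime import datetime

# Sample rows copied from the report; DD-MM-YYYY is an assumption inferred from these values.
samples = ["21-01-2013", "24-05-2014", "08-04-2013", "29-03-2014", "10-06-2014"]

parsed = [datetime.strptime(text, "%d-%m-%Y").date() for text in samples]
earliest, latest = min(parsed), max(parsed)
print(earliest.isoformat(), latest.isoformat())
```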

Common Values

Value | Count | Frequency (%)
28-10-2013 | 7 | 0.6%
03-06-2013 | 7 | 0.6%
10-01-2013 | 6 | 0.5%
20-08-2013 | 6 | 0.5%
31-08-2012 | 6 | 0.5%
30-12-2012 | 5 | 0.5%
07-08-2013 | 5 | 0.5%
11-05-2013 | 5 | 0.5%
18-05-2014 | 5 | 0.5%
22-08-2012 | 5 | 0.5%
Other values (526) | 1051 | 94.9%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
28-10-2013 | 7 | 0.6%
03-06-2013 | 7 | 0.6%
10-01-2013 | 6 | 0.5%
20-08-2013 | 6 | 0.5%
31-08-2012 | 6 | 0.5%
20-02-2013 | 5 | 0.5%
22-05-2013 | 5 | 0.5%
13-01-2013 | 5 | 0.5%
29-05-2014 | 5 | 0.5%
06-05-2013 | 5 | 0.5%
Other values (526) | 1051 | 94.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 2467 | 22.3%
- | 2216 | 20.0%
1 | 2088 | 18.8%
2 | 2032 | 18.3%
3 | 890 | 8.0%
4 | 426 | 3.8%
8 | 220 | 2.0%
5 | 204 | 1.8%
9 | 191 | 1.7%
6 | 183 | 1.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8864 | 80.0%
Dash Punctuation | 2216 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 2467 | 27.8%
1 | 2088 | 23.6%
2 | 2032 | 22.9%
3 | 890 | 10.0%
4 | 426 | 4.8%
8 | 220 | 2.5%
5 | 204 | 2.3%
9 | 191 | 2.2%
6 | 183 | 2.1%
7 | 163 | 1.8%

Dash Punctuation
Value | Count | Frequency (%)
- | 2216 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 11080 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 2467 | 22.3%
- | 2216 | 20.0%
1 | 2088 | 18.8%
2 | 2032 | 18.3%
3 | 890 | 8.0%
4 | 426 | 3.8%
8 | 220 | 2.0%
5 | 204 | 1.8%
9 | 191 | 1.7%
6 | 183 | 1.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11080 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 2467 | 22.3%
- | 2216 | 20.0%
1 | 2088 | 18.8%
2 | 2032 | 18.3%
3 | 890 | 8.0%
4 | 426 | 3.8%
8 | 220 | 2.0%
5 | 204 | 1.8%
9 | 191 | 1.7%
6 | 183 | 1.7%

Recency
Real number (ℝ≥0)

ZEROS

Distinct: 100
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.15613718
Minimum: 0
Maximum: 99
Zeros: 15
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 25
median: 51
Q3: 76
95-th percentile: 94
Maximum: 99
Range: 99
Interquartile range (IQR): 51

Descriptive statistics

Standard deviation: 29.08558204
Coefficient of variation (CV): 0.5799007593
Kurtosis: -1.207803544
Mean: 50.15613718
Median Absolute Deviation (MAD): 26
Skewness: -0.06130958542
Sum: 55573
Variance: 845.9710824
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Most frequent values
Value | Count | Frequency (%)
56 | 20 | 1.8%
84 | 18 | 1.6%
81 | 18 | 1.6%
87 | 18 | 1.6%
49 | 17 | 1.5%
3 | 17 | 1.5%
77 | 17 | 1.5%
25 | 17 | 1.5%
51 | 17 | 1.5%
92 | 17 | 1.5%
Other values (90) | 932 | 84.1%

Smallest values
Value | Count | Frequency (%)
0 | 15 | 1.4%
1 | 8 | 0.7%
2 | 11 | 1.0%
3 | 17 | 1.5%
4 | 14 | 1.3%
5 | 8 | 0.7%
6 | 8 | 0.7%
7 | 6 | 0.5%
8 | 13 | 1.2%
9 | 14 | 1.3%

Largest values
Value | Count | Frequency (%)
99 | 11 | 1.0%
98 | 10 | 0.9%
97 | 9 | 0.8%
96 | 13 | 1.2%
95 | 7 | 0.6%
94 | 16 | 1.4%
93 | 12 | 1.1%
92 | 17 | 1.5%
91 | 10 | 0.9%
90 | 12 | 1.1%

NumDealsPurchases
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 15
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.339350181
Minimum: 0
Maximum: 15
Zeros: 27
Zeros (%): 2.4%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 6
Maximum: 15
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.94327979
Coefficient of variation (CV): 0.8306921326
Kurtosis: 7.729947829
Mean: 2.339350181
Median Absolute Deviation (MAD): 1
Skewness: 2.264245415
Sum: 2592
Variance: 3.776336343
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=15)]

Most frequent values
Value | Count | Frequency (%)
1 | 478 | 43.1%
2 | 238 | 21.5%
3 | 143 | 12.9%
4 | 99 | 8.9%
5 | 48 | 4.3%
6 | 29 | 2.6%
0 | 27 | 2.4%
7 | 21 | 1.9%
8 | 9 | 0.8%
10 | 5 | 0.5%
Other values (5) | 11 | 1.0%

Smallest values
Value | Count | Frequency (%)
0 | 27 | 2.4%
1 | 478 | 43.1%
2 | 238 | 21.5%
3 | 143 | 12.9%
4 | 99 | 8.9%
5 | 48 | 4.3%
6 | 29 | 2.6%
7 | 21 | 1.9%
8 | 9 | 0.8%
9 | 3 | 0.3%

Largest values
Value | Count | Frequency (%)
15 | 3 | 0.3%
13 | 1 | 0.1%
12 | 1 | 0.1%
11 | 3 | 0.3%
10 | 5 | 0.5%
9 | 3 | 0.3%
8 | 9 | 0.8%
7 | 21 | 1.9%
6 | 29 | 2.6%
5 | 48 | 4.3%
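
Because NumDealsPurchases has only 15 distinct values, the three frequency tables together list every value, so the reported Sum and Mean can be recovered from them. A minimal sketch, with the value/count pairs transcribed by hand from those tables (the name `value_counts` is hypothetical):

```python
# Value -> count pairs read from the NumDealsPurchases frequency tables.
value_counts = {0: 27, 1: 478, 2: 238, 3: 143, 4: 99, 5: 48, 6: 29, 7: 21,
                8: 9, 9: 3, 10: 5, 11: 3, 12: 1, 13: 1, 15: 3}

n_rows = sum(value_counts.values())  # should equal the number of observations
total = sum(value * count for value, count in value_counts.items())
mean = total / n_rows
print(n_rows, total, round(mean, 9))
```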

NumWebPurchases
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 14
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.184115523
Minimum: 0
Maximum: 27
Zeros: 26
Zeros (%): 2.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 9
Maximum: 27
Range: 27
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.810555731
Coefficient of variation (CV): 0.6717203947
Kurtosis: 4.924827709
Mean: 4.184115523
Median Absolute Deviation (MAD): 2
Skewness: 1.289607029
Sum: 4636
Variance: 7.899223517
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=14)]

Most frequent values
Value | Count | Frequency (%)
3 | 179 | 16.2%
1 | 166 | 15.0%
2 | 165 | 14.9%
4 | 142 | 12.8%
5 | 110 | 9.9%
6 | 95 | 8.6%
7 | 79 | 7.1%
8 | 59 | 5.3%
9 | 34 | 3.1%
10 | 27 | 2.4%
Other values (4) | 52 | 4.7%

Smallest values
Value | Count | Frequency (%)
0 | 26 | 2.3%
1 | 166 | 15.0%
2 | 165 | 14.9%
3 | 179 | 16.2%
4 | 142 | 12.8%
5 | 110 | 9.9%
6 | 95 | 8.6%
7 | 79 | 7.1%
8 | 59 | 5.3%
9 | 34 | 3.1%

Largest values
Value | Count | Frequency (%)
27 | 1 | 0.1%
23 | 1 | 0.1%
11 | 24 | 2.2%
10 | 27 | 2.4%
9 | 34 | 3.1%
8 | 59 | 5.3%
7 | 79 | 7.1%
6 | 95 | 8.6%
5 | 110 | 9.9%
4 | 142 | 12.8%

NumCatalogPurchases
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 12
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.690433213
Minimum: 0
Maximum: 11
Zeros: 279
Zeros (%): 25.2%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 4
95-th percentile: 9
Maximum: 11
Range: 11
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.792236397
Coefficient of variation (CV): 1.037838956
Kurtosis: 0.3807830549
Mean: 2.690433213
Median Absolute Deviation (MAD): 2
Skewness: 1.099499113
Sum: 2981
Variance: 7.796584094
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=12)]

Most frequent values
Value | Count | Frequency (%)
0 | 279 | 25.2%
1 | 248 | 22.4%
2 | 141 | 12.7%
4 | 95 | 8.6%
3 | 83 | 7.5%
5 | 76 | 6.9%
6 | 60 | 5.4%
7 | 37 | 3.3%
10 | 31 | 2.8%
8 | 27 | 2.4%
Other values (2) | 31 | 2.8%

Smallest values
Value | Count | Frequency (%)
0 | 279 | 25.2%
1 | 248 | 22.4%
2 | 141 | 12.7%
3 | 83 | 7.5%
4 | 95 | 8.6%
5 | 76 | 6.9%
6 | 60 | 5.4%
7 | 37 | 3.3%
8 | 27 | 2.4%
9 | 22 | 2.0%

Largest values
Value | Count | Frequency (%)
11 | 9 | 0.8%
10 | 31 | 2.8%
9 | 22 | 2.0%
8 | 27 | 2.4%
7 | 37 | 3.3%
6 | 60 | 5.4%
5 | 76 | 6.9%
4 | 95 | 8.6%
3 | 83 | 7.5%
2 | 141 | 12.7%

NumStorePurchases
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 14
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.905234657
Minimum: 0
Maximum: 13
Zeros: 6
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
median: 5
Q3: 8
95-th percentile: 12
Maximum: 13
Range: 13
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.306811786
Coefficient of variation (CV): 0.5599797431
Kurtosis: -0.758848672
Mean: 5.905234657
Median Absolute Deviation (MAD): 2
Skewness: 0.6536889162
Sum: 6543
Variance: 10.93500419
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=14)]

Most frequent values
Value | Count | Frequency (%)
3 | 233 | 21.0%
4 | 163 | 14.7%
2 | 108 | 9.7%
5 | 107 | 9.7%
6 | 80 | 7.2%
8 | 73 | 6.6%
7 | 67 | 6.0%
10 | 60 | 5.4%
12 | 59 | 5.3%
9 | 57 | 5.1%
Other values (4) | 101 | 9.1%

Smallest values
Value | Count | Frequency (%)
0 | 6 | 0.5%
1 | 4 | 0.4%
2 | 108 | 9.7%
3 | 233 | 21.0%
4 | 163 | 14.7%
5 | 107 | 9.7%
6 | 80 | 7.2%
7 | 67 | 6.0%
8 | 73 | 6.6%
9 | 57 | 5.1%

Largest values
Value | Count | Frequency (%)
13 | 41 | 3.7%
12 | 59 | 5.3%
11 | 50 | 4.5%
10 | 60 | 5.4%
9 | 57 | 5.1%
8 | 73 | 6.6%
7 | 67 | 6.0%
6 | 80 | 7.2%
5 | 107 | 9.7%
4 | 163 | 14.7%

NumWebVisitsMonth
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 15
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.348375451
Minimum: 0
Maximum: 20
Zeros: 5
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 6
Q3: 7
95-th percentile: 8
Maximum: 20
Range: 20
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.405114821
Coefficient of variation (CV): 0.4496907226
Kurtosis: 2.346720892
Mean: 5.348375451
Median Absolute Deviation (MAD): 2
Skewness: 0.2990003964
Sum: 5926
Variance: 5.784577304
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=15)]

Most frequent values
Value | Count | Frequency (%)
7 | 188 | 17.0%
6 | 177 | 16.0%
8 | 169 | 15.3%
5 | 139 | 12.5%
3 | 112 | 10.1%
4 | 109 | 9.8%
2 | 94 | 8.5%
1 | 67 | 6.0%
9 | 42 | 3.8%
0 | 5 | 0.5%
Other values (5) | 6 | 0.5%

Smallest values
Value | Count | Frequency (%)
0 | 5 | 0.5%
1 | 67 | 6.0%
2 | 94 | 8.5%
3 | 112 | 10.1%
4 | 109 | 9.8%
5 | 139 | 12.5%
6 | 177 | 16.0%
7 | 188 | 17.0%
8 | 169 | 15.3%
9 | 42 | 3.8%

Largest values
Value | Count | Frequency (%)
20 | 2 | 0.2%
19 | 1 | 0.1%
14 | 1 | 0.1%
13 | 1 | 0.1%
10 | 1 | 0.1%
9 | 42 | 3.8%
8 | 169 | 15.3%
7 | 188 | 17.0%
6 | 177 | 16.0%
5 | 139 | 12.5%

AcceptedCmp3
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 1031
1 | 77

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1031 | 93.1%
1 | 77 | 6.9%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1031 | 93.1%
1 | 77 | 6.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 1031 | 93.1%
1 | 77 | 6.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1031 | 93.1%
1 | 77 | 6.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1031 | 93.1%
1 | 77 | 6.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1031 | 93.1%
1 | 77 | 6.9%

AcceptedCmp4
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 1013
1 | 95

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1013 | 91.4%
1 | 95 | 8.6%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1013 | 91.4%
1 | 95 | 8.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 1013 | 91.4%
1 | 95 | 8.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1013 | 91.4%
1 | 95 | 8.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1013 | 91.4%
1 | 95 | 8.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1013 | 91.4%
1 | 95 | 8.6%

AcceptedCmp5
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 1028
1 | 80

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1028 | 92.8%
1 | 80 | 7.2%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1028 | 92.8%
1 | 80 | 7.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 1028 | 92.8%
1 | 80 | 7.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1028 | 92.8%
1 | 80 | 7.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1028 | 92.8%
1 | 80 | 7.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1028 | 92.8%
1 | 80 | 7.2%

AcceptedCmp1
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 1032
1 | 76

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 1032 | 93.1%
1 | 76 | 6.9%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1032 | 93.1%
1 | 76 | 6.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 1032 | 93.1%
1 | 76 | 6.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1032 | 93.1%
1 | 76 | 6.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1032 | 93.1%
1 | 76 | 6.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1032 | 93.1%
1 | 76 | 6.9%

AcceptedCmp2
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value | Count
0 | 1091
1 | 17

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1091 | 98.5%
1 | 17 | 1.5%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1091 | 98.5%
1 | 17 | 1.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 1091 | 98.5%
1 | 17 | 1.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1108 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1091 | 98.5%
1 | 17 | 1.5%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1108 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1091 | 98.5%
1 | 17 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1091 | 98.5%
1 | 17 | 1.5%

Complain
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value  Count
0      1098
1      10

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1098   99.1%
1      10     0.9%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Value  Count  Frequency (%)
0      1098   99.1%
1      10     0.9%

Most occurring characters

Value  Count  Frequency (%)
0      1098   99.1%
1      10     0.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1108   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1098   99.1%
1      10     0.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  1108   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1098   99.1%
1      10     0.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1108   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      1098   99.1%
1      10     0.9%

Response
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.8 KiB

Value  Count
0      951
1      157

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1108
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      951    85.8%
1      157    14.2%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Value  Count  Frequency (%)
0      951    85.8%
1      157    14.2%

Most occurring characters

Value  Count  Frequency (%)
0      951    85.8%
1      157    14.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1108   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      951    85.8%
1      157    14.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  1108   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      951    85.8%
1      157    14.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1108   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      951    85.8%
1      157    14.2%

target
Real number (ℝ≥0)

HIGH CORRELATION (4 alerts)

Distinct: 694
Distinct (%): 62.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 617.1218412
Minimum: 6
Maximum: 2525
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.8 KiB

Quantile statistics

Minimum: 6
5-th percentile: 22
Q1: 70.75
Median: 412
Q3: 1068.75
95-th percentile: 1786.55
Maximum: 2525
Range: 2519
Interquartile range (IQR): 998

Descriptive statistics

Standard deviation: 603.5879717
Coefficient of variation (CV): 0.978069372
Kurtosis: -0.4454106947
Mean: 617.1218412
Median Absolute Deviation (MAD): 367
Skewness: 0.817892666
Sum: 683771
Variance: 364318.4395
Monotonicity: Not monotonic
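
Several of the statistics above are derived from others, which gives a quick consistency check. The sketch below recomputes the mean, range, IQR, variance and coefficient of variation from the reported sum, extremes, quartiles and standard deviation. The variable names are illustrative; the input numbers are copied from the report, with n = 1108 taken from the dataset overview.

```python
# Values copied from the report for the variable `target`.
n = 1108                      # number of observations (no missing values)
total = 683771                # Sum
minimum, maximum = 6, 2525
q1, q3 = 70.75, 1068.75
std = 603.5879717             # Standard deviation

mean = total / n
derived = {
    "mean": mean,                 # Sum / n
    "range": maximum - minimum,   # Maximum - Minimum
    "iqr": q3 - q1,               # Q3 - Q1
    "variance": std ** 2,         # square of the standard deviation
    "cv": std / mean,             # standard deviation / mean
}

for name, value in derived.items():
    print(f"{name}: {value:.10g}")
```

A CV close to 1 together with a median (412) well below the mean (about 617) is consistent with the positive skewness reported for this variable.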
[Figure: histogram with fixed size bins (bins=50)]

Value               Count  Frequency (%)
46                  13     1.2%
22                  11     1.0%
48                  8      0.7%
15                  8      0.7%
37                  8      0.7%
57                  7      0.6%
41                  7      0.6%
20                  7      0.6%
45                  7      0.6%
38                  7      0.6%
Other values (684)  1025   92.5%

Value  Count  Frequency (%)
6      2      0.2%
8      2      0.2%
9      1      0.1%
10     3      0.3%
11     1      0.1%
13     3      0.3%
14     1      0.1%
15     8      0.7%
16     4      0.4%
17     6      0.5%

Value  Count  Frequency (%)
2525   1      0.1%
2440   1      0.1%
2302   1      0.1%
2279   1      0.1%
2257   1      0.1%
2252   1      0.1%
2217   1      0.1%
2211   1      0.1%
2194   1      0.1%
2153   1      0.1%

Interactions

[Figures: pairwise interaction plots (100 images, not preserved in this text export)]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
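
The definition above can be sketched in plain Python: rank each variable (ties share the mean of their positions), then divide the covariance of the ranks by the product of the ranks' standard deviations. The function names are illustrative, and this is a minimal sketch rather than the implementation used to produce the report.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def covariance(a, b):
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def spearman_rho(x, y):
    """Covariance of the ranks over the product of their standard deviations.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / math.sqrt(covariance(rx, rx) * covariance(ry, ry))

# A cubic relationship is nonlinear but perfectly monotonic.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))
```

Because only the ordering matters, the cubic relationship gives the same ρ as a straight line would.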

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
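
A minimal sketch of this calculation, which also checks the invariance under changes of location and (positive) scale. The function name `pearson_r` and the sample data are made up for the illustration.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    Raises ZeroDivisionError if either variable is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [2 * a + 3 for a in x]
y_shifted = [10 * b - 1 for b in y]
r_shifted = pearson_r(x_shifted, y_shifted)

print(r, r_shifted)
```

The two printed values agree up to floating-point error. A negative scale factor on one variable would flip the sign of r.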

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
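
The pair-counting definition can be sketched directly. This is the simple "tau-a" form, in which tied pairs count as neither concordant nor discordant; many libraries report the tau-b variant instead, which adjusts the denominator for ties, so results can differ on data with many repeated values. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Six pairs in total; only the pair (2, 3) vs (3, 2) is discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.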

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
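
For reference, here is a sketch of a bias-corrected Cramér's V computed from a contingency table of counts. It follows the commonly cited form of Bergsma's correction: subtract (k-1)(r-1)/(n-1) from φ² (clipping at 0) and shrink the row and column counts r and k before normalising. The function name and the list-of-lists table layout are assumptions for this example, and the report's own implementation may differ in details such as continuity correction.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of counts.

    Assumes every row and column total is non-zero and n > 1.
    """
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Rows and columns are proportional, so the variables look independent.
independent = [[10, 20], [20, 40]]
# Each row value occurs with exactly one column value.
perfect = [[50, 0], [0, 50]]
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```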

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

First rows

id  Year_Birth  Education  Marital_Status  Income  Kidhome  Teenhome  Dt_Customer  Recency  NumDealsPurchases  NumWebPurchases  NumCatalogPurchases  NumStorePurchases  NumWebVisitsMonth  AcceptedCmp3  AcceptedCmp4  AcceptedCmp5  AcceptedCmp1  AcceptedCmp2  Complain  Response  target
001974MasterTogether46014.01121-01-2013211071870000000541
111962GraduationSingle76624.00124-05-2014681510711000000899
221951GraduationMarried75903.00108-04-201350266930000000901
331974BasicMarried18393.01029-03-2014223038000000050
441946PhDTogether64014.02110-06-201456782570001000444
551952GraduationSingle47958.00119-01-20138263550000000407
661971GraduationSingle22804.01031-07-20137512029000000026
771978GraduationWidow54162.01118-03-20133111034000000042
881968GraduationMarried45688.00125-01-201420231840100000306
991952GraduationSingle61823.00118-02-2013264821070000000884

Last rows

id  Year_Birth  Education  Marital_Status  Income  Kidhome  Teenhome  Dt_Customer  Recency  NumDealsPurchases  NumWebPurchases  NumCatalogPurchases  NumStorePurchases  NumWebVisitsMonth  AcceptedCmp3  AcceptedCmp4  AcceptedCmp5  AcceptedCmp1  AcceptedCmp2  Complain  Response  target
109810981970PhDMarried23626.01024-05-20148433135000000043
109910991973PhDMarried85844.00029-05-2014621667200100001958
110011001956MasterSingle55284.00124-12-201260375850000000764
110111011971MasterDivorced42835.01130-06-201364766460000000595
110211021946PhDSingle82800.00024-11-20122317612300110011315
110311031956GraduationTogether46097.00131-03-201311531640000000241
110411041986GraduationMarried23477.01021-10-201339330480000000147
110511051975MasterMarried37368.01016-12-2013411026100000030
110611061974GraduationDivorced53034.01130-05-201330861780000000447
110711071952PhDDivorced46610.00229-10-20128641660000001302